Efficiently Compiling Efficient Query Plans for Modern Hardware

Author

  • Thomas Neumann
Abstract

As main memory grows, query performance is more and more determined by the raw CPU costs of query processing itself. The classical iterator-style query processing technique is very simple and flexible, but shows poor performance on modern CPUs due to lack of locality and frequent instruction mispredictions. Several techniques like batch-oriented processing or vectorized tuple processing have been proposed in the past to improve this situation, but even these techniques are frequently outperformed by handwritten execution plans. In this work, we present a novel compilation strategy that translates a query into compact and efficient machine code using the LLVM compiler framework. By aiming at good code and data locality and predictable branch layout, the resulting code frequently rivals the performance of handwritten C++ code. We integrated these techniques into the HyPer main memory database system and show that this results in excellent query performance while requiring only modest compilation time.
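
The contrast the abstract draws between iterator-style processing and compact generated code can be illustrated with a small sketch. This is not code from the paper or from HyPer, and it does not use LLVM. The names `Operator`, `Scan`, `Filter`, `sumIterator`, and `sumFused` are hypothetical, and the query (sum of all values above a threshold) is an assumed example. The first version pulls one tuple at a time through virtual `next()` calls; the second is the kind of tight, fused loop a query compiler would aim to produce for the same plan.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical tuple type for illustration: a single integer column.
using Tuple = std::int64_t;

// Iterator-model operator: every tuple costs one virtual next() call.
struct Operator {
    virtual ~Operator() = default;
    virtual std::optional<Tuple> next() = 0;
};

// Leaf operator that hands out the stored values one by one.
struct Scan : Operator {
    const std::vector<Tuple>& data;
    std::size_t pos = 0;
    explicit Scan(const std::vector<Tuple>& d) : data(d) {}
    std::optional<Tuple> next() override {
        if (pos == data.size()) return std::nullopt;
        return data[pos++];
    }
};

// Selection operator that keeps only tuples greater than the threshold.
struct Filter : Operator {
    std::unique_ptr<Operator> child;
    Tuple threshold;
    Filter(std::unique_ptr<Operator> c, Tuple t)
        : child(std::move(c)), threshold(t) {}
    std::optional<Tuple> next() override {
        while (auto t = child->next()) {
            if (*t > threshold) return t;
        }
        return std::nullopt;
    }
};

// Assumed query: SELECT sum(x) FROM data WHERE x > threshold,
// evaluated by pulling tuples through the operator tree.
Tuple sumIterator(const std::vector<Tuple>& data, Tuple threshold) {
    Filter plan(std::make_unique<Scan>(data), threshold);
    Tuple sum = 0;
    while (auto t = plan.next()) sum += *t;
    return sum;
}

// The same query written as one fused loop: no virtual calls, and the
// running sum can stay in a register for the whole scan.
Tuple sumFused(const std::vector<Tuple>& data, Tuple threshold) {
    Tuple sum = 0;
    for (Tuple x : data) {
        if (x > threshold) sum += x;
    }
    return sum;
}
```

Both functions compute the same result; the difference lies only in how control flows between operators, which is the overhead the abstract attributes to the iterator model.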

Similar resources

Multi-level Parallel Query Execution Framework for CPU and GPU

Recent developments have shown that classic database query execution techniques, such as the iterator model, are no longer optimal to leverage the features of modern hardware architectures. This is especially true for massive parallel architectures, such as many-core processors and GPUs. Here, the processing of single tuples in one step is not enough work to utilize the hardware resources and t...

A Cost Model for Data Stream Processing on Modern Hardware

For stream processing application domains, which use queries to process or analyze data arriving from potentially endless streams, low latency and high throughput are key requirements. These are not easy to achieve, as many factors influence the actual runtime of query execution plans and one cannot measure all of them individually. Therefore, query optimizers try to overcome this hurdle by using ...

Radish: Compiling Efficient Query Plans for Distributed Shared Memory

We present Radish, a query compiler that generates distributed programs. Recent efforts have shown that compiling queries to machine code for a single-core can remove iterator and control overhead for significant performance gains. So far, systems that generate distributed programs only compile plans for single processors and stitch them together with messaging. In this paper, we describe an ap...

Compiling queries for high-performance computing

Data-intensive applications motivate the integration of high-productivity query languages with high-performance computing runtimes. We present a technique, Compiled Parallel Pipelines (CPP), for compiling relational query plans to programs suitable for high-performance computing platforms. Rather than compose a sequential query compiler with a high-performance communication library like MPI, we ta...

Efficient Query Processing on Modern Hardware

Most database systems translate a given query into an expression in a (physical) algebra, and then start evaluating this algebraic expression to produce the query result. The traditional way to execute these algebraic plans is the iterator model: Every physical algebraic operator conceptually produces a tuple stream from its input, and allows for iterating over this tuple stream. This is a very...

Journal title:
  • PVLDB

Volume 4  Issue 

Pages  -

Publication date 2011